Survey of Apache Spark optimized job scheduling in big data

Authors
Abstract

Similar resources

Static and Dynamic Big Data Partitioning on Apache Spark

Many of today’s large datasets are organized as a graph. Due to their size, it is often infeasible to process these graphs using a single machine. Therefore, many software frameworks and tools have been proposed to process graphs on top of distributed infrastructures. This software is often bundled with generic data decomposition strategies that are not optimised for specific algorithms. In this ...

A Survey on Big Data Management and Job Scheduling

Big data has gained popularity in recent years because sophisticated methods are needed to collect, process, analyze, and visualize the huge volumes of data generated by our digital and computing world. Several challenges in handling petabytes of information, commonly called big data, need to be addressed more efficiently. Big data management (BDM) is the process ...

A comparison on scalability for batch big data processing on Apache Spark and Apache Flink

Corresponding author affiliation: Department of Computer Science and Artificial Intelligence, CITIC-UGR (Research Center on Information and Communications Technology), University of Granada, Calle Periodista Daniel Saucedo Aranda, 18071 Granada, Spain. Abstract: The large amounts of data have created a need for new fram...

An Information Theoretic Feature Selection Framework for Big Data under Apache Spark

With the advent of extremely high dimensional datasets, dimensionality reduction techniques are becoming mandatory. Among many techniques, feature selection has been growing in interest as an important tool to identify relevant features on huge datasets (both in number of instances and features). The purpose of this work is to demonstrate that standard feature selection methods can be paralleli...

Optimized Thermal-Aware Job Scheduling and Control of Data Centers

Analyzing data centers with thermal-aware optimization techniques is a viable approach to reduce energy consumption of data centers. By taking into account thermal consequences of job placements among the servers of a data center, it is possible to reduce the amount of cooling necessary to keep the servers below a given safe temperature threshold. We set up an optimization problem to analyze an...

Journal

Journal title: International Journal of Industry and Sustainable Development

Year: 2020

ISSN: 2682-4000

DOI: 10.21608/ijisd.2020.73486